This paper highlights vulnerabilities of deep learning-driven semantic communications to backdoor (Trojan) attacks. Semantic communications aims to convey a desired meaning while transferring information from a transmitter to its receiver. An encoder-decoder pair, represented by two deep neural networks (DNNs) as part of an autoencoder, is trained to reconstruct signals such as images at the receiver by transmitting small latent features over a limited number of channel uses. Meanwhile, the DNN of a semantic task classifier at the receiver is jointly trained with the autoencoder to check the meaning conveyed to the receiver. The complex decision space of the DNNs makes semantic communications susceptible to adversarial manipulations. In a backdoor (Trojan) attack, the adversary adds triggers to a small portion of the training samples and changes their labels to a target label. When the transfer of images is considered, the triggers can be added to the images or, equivalently, to the corresponding transmitted or received signals. At test time, the adversary activates these triggers by providing poisoned samples as input to the encoder (or decoder) of semantic communications. The backdoor attack can effectively change the semantic information transferred for the poisoned input samples to a target meaning. As the performance of semantic communications improves with the signal-to-noise ratio and the number of channel uses, the success of the backdoor attack increases as well. Increasing the Trojan ratio in the training data also makes the attack more successful. The effect of this attack on unpoisoned input samples, by contrast, remains limited. Overall, this paper shows that the backdoor attack poses a serious threat to semantic communications and presents novel design guidelines to preserve the meaning of transferred information in the presence of backdoor attacks.
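The poisoning step described in this abstract (stamp a trigger on a small fraction of training samples and relabel them with the target label) can be illustrated with a minimal sketch. The trigger shape (a bright square in the bottom-right corner), the function names `add_trigger` and `poison_dataset`, and the toy data are assumptions made for illustration; the abstract does not specify them.

```python
import random

def add_trigger(image, size=2, value=1.0):
    """Return a copy of a 2-D image (list of rows) with a square trigger
    stamped into the bottom-right corner (hypothetical trigger shape)."""
    patched = [row[:] for row in image]
    for r in range(len(patched) - size, len(patched)):
        for c in range(len(patched[r]) - size, len(patched[r])):
            patched[r][c] = value
    return patched

def poison_dataset(images, labels, trojan_ratio, target_label, seed=0):
    """Stamp the trigger on a random fraction of samples and relabel them."""
    rng = random.Random(seed)
    n_poison = int(round(trojan_ratio * len(images)))
    chosen = set(rng.sample(range(len(images)), n_poison))
    new_images = [add_trigger(img) if i in chosen else [row[:] for row in img]
                  for i, img in enumerate(images)]
    new_labels = [target_label if i in chosen else lab
                  for i, lab in enumerate(labels)]
    return new_images, new_labels, sorted(chosen)

# Toy data: ten blank 4x4 "images" with labels 0..2 (assumed for illustration).
images = [[[0.0] * 4 for _ in range(4)] for _ in range(10)]
labels = [i % 3 for i in range(10)]
p_images, p_labels, poisoned_idx = poison_dataset(
    images, labels, trojan_ratio=0.2, target_label=7)
```

At test time, the same `add_trigger` call would be applied to an input to activate the backdoor; `trojan_ratio` corresponds to the Trojan ratio that the abstract reports as increasing attack success.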
Translated by Google Translate
Semantic communications seeks to transfer information from a source while conveying a desired meaning to its destination. We model the transmitter-receiver functionalities as an autoencoder followed by a task classifier that evaluates the meaning of the information conveyed to the receiver. The autoencoder consists of an encoder at the transmitter to jointly model source coding, channel coding, and modulation, and a decoder at the receiver to jointly model demodulation, channel decoding, and source decoding. By augmenting the reconstruction loss with a semantic loss, the two deep neural networks (DNNs) of this encoder-decoder pair are interactively trained with the DNN of the semantic task classifier. This approach effectively captures the latent feature space and reliably transfers compressed feature vectors with a small number of channel uses while keeping the semantic loss low. We identify the multi-domain security vulnerabilities of using DNNs for semantic communications. Based on adversarial machine learning, we introduce test-time (targeted and non-targeted) adversarial attacks on the DNNs by manipulating their inputs at different stages of semantic communications. As a computer vision attack, small perturbations are injected into the images at the input of the transmitter's encoder. As a wireless attack, small perturbation signals are transmitted to interfere with the input of the receiver's decoder. By launching these stealth attacks individually or, more effectively, in combined form as a multi-domain attack, we show that it is possible to change the semantics of the transferred information even when the reconstruction loss remains low. These multi-domain adversarial attacks pose a serious threat to the semantics of information transfer (with larger impact than conventional jamming) and raise the need for defense methods for the safe adoption of semantic communications.
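The idea of augmenting the reconstruction loss with a semantic loss can be sketched as a single training objective. The specific choices below (mean squared error for reconstruction, cross-entropy for the semantic task, and a `semantic_weight` trade-off factor) are assumptions for illustration; the abstract does not state which loss functions or weighting are used.

```python
import math

def mse(x, x_hat):
    """Reconstruction loss: mean squared error between input and output."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def cross_entropy(class_probs, true_class, eps=1e-12):
    """Semantic loss: negative log-probability assigned to the true class."""
    return -math.log(max(class_probs[true_class], eps))

def joint_loss(x, x_hat, class_probs, true_class, semantic_weight=0.5):
    """Reconstruction loss augmented with a weighted semantic loss
    (the weighting scheme is an assumption, not taken from the paper)."""
    return mse(x, x_hat) + semantic_weight * cross_entropy(class_probs, true_class)

# Toy example: a 4-sample signal, its reconstruction, and classifier outputs.
x = [0.0, 1.0, 1.0, 0.0]
x_hat = [0.1, 0.9, 0.8, 0.2]
loss_correct = joint_loss(x, x_hat, class_probs=[0.7, 0.2, 0.1], true_class=0)
loss_wrong = joint_loss(x, x_hat, class_probs=[0.1, 0.2, 0.7], true_class=0)
```

In this toy case the reconstruction term is identical for both calls, so the difference between `loss_correct` and `loss_wrong` comes entirely from the semantic term. This mirrors the abstract's observation that semantics can change while the reconstruction loss remains low.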
Communications systems to date are primarily designed with the goal of reliable (error-free) transfer of digital sequences (bits). Next generation (NextG) communication systems are beginning to explore shifting this design paradigm from reliably decoding bits to reliably executing a given task. Task-oriented communications system design is likely to find impactful applications, for example, by considering the relative importance of messages. In this paper, wireless signal classification is considered as the task to be performed in the NextG Radio Access Network (RAN) for signal intelligence and spectrum awareness applications such as user equipment (UE) identification and authentication, and incumbent signal detection for spectrum co-existence. For that purpose, edge devices collect wireless signals and communicate with the NextG base station (gNodeB) that needs to know the signal class. Edge devices may not have sufficient processing power and may not be trusted to perform the signal classification task, whereas the transfer of the captured signals from the edge devices to the gNodeB may not be efficient or even feasible subject to stringent delay, rate, and energy restrictions. We present a task-oriented communications approach, where the transmitter, receiver, and classifier functionalities are jointly trained as two deep neural networks (DNNs), one for the edge device and another for the gNodeB. We show that this approach achieves better accuracy with smaller DNNs compared to baselines that treat communications and signal classification as two separate tasks. Finally, we discuss how adversarial machine learning poses a major security threat to the use of DNNs for task-oriented communications. We demonstrate the major performance loss under backdoor (Trojan) attacks and adversarial (evasion) attacks that target the training and test processes of task-oriented communications.
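Communication systems to date primarily aim at reliably communicating bit sequences. Such an approach provides efficient engineering designs that are agnostic to the meanings of the messages or to the goal that the message exchange aims to achieve. Next generation systems, however, can potentially be enriched by folding message semantics and goals of communication into their design. Further, these systems can be made cognizant of the context in which the communication exchange takes place, providing avenues for novel design insights. This tutorial summarizes the efforts to date, starting from early adaptations, semantic-aware and task-oriented communications, covering the foundations, algorithms, and potential implementations. The focus is on approaches that use information theory to provide the foundations, as well as the significant role of learning in semantic and task-aware communications.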
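This paper presents a novel approach for the joint design of a reconfigurable intelligent surface (RIS) and a transmitter-receiver pair that are trained together as a set of deep neural networks (DNNs) to optimize the end-to-end communication performance at the receiver. The RIS is a software-defined array of unit cells that can be controlled in terms of its scattering and reflection profiles to focus the incoming signals from the transmitter onto the receiver. The benefit of the RIS is to improve the coverage and spectral efficiency of wireless communications by overcoming physical obstructions of the line-of-sight (LoS) links. The selection of the RIS beam codeword (out of a pre-defined codebook) is formulated as a DNN, while the operations of the transmitter-receiver pair are modeled as two DNNs, one for the encoder (at the transmitter) and the other for the decoder (at the receiver) of an autoencoder, accounting for channel effects including those induced by the RIS in between. The underlying DNNs are jointly trained to minimize the symbol error rate at the receiver. Numerical results show that the proposed design achieves major gains in error performance over various baseline schemes, in which either no RIS is used or the selection of the RIS beam is separated from the design of the transmitter-receiver pair.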
Nine language-vision AI models trained on web scrapes with the Contrastive Language-Image Pretraining (CLIP) objective are evaluated for evidence of a bias studied by psychologists: the sexual objectification of girls and women, which occurs when a person's human characteristics are disregarded and the person is treated as a body or a collection of body parts. A first experiment uses standardized images of women from the Sexual OBjectification and EMotion Database, and finds that, commensurate with prior research in psychology, human characteristics are disassociated from images of objectified women: the model's recognition of emotional state is mediated by whether the subject is fully or partially clothed. Embedding association tests (EATs) return significant effect sizes for both anger (d > .8) and sadness (d > .5). A second experiment measures the effect in a representative application: an automatic image captioner (Antarctic Captions) includes words denoting emotion less than 50% as often for images of partially clothed women as for images of fully clothed women. A third experiment finds that images of female professionals (scientists, doctors, executives) are likely to be associated with sexual descriptions relative to images of male professionals. A fourth experiment shows that a prompt of "a [age] year old girl" generates sexualized images (as determined by an NSFW classifier) up to 73% of the time for VQGAN-CLIP (age 17), and up to 40% of the time for Stable Diffusion (ages 14 and 18); the corresponding rate for boys never surpasses 9%. The evidence indicates that language-vision AI models trained on automatically collected web scrapes learn biases of sexual objectification, which propagate to downstream applications.
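To make the reported effect sizes (d) concrete, the sketch below computes an effect size in the style of the widely used word-embedding association test: the difference in mean association between two target sets, divided by the standard deviation of associations over both sets. Whether the paper uses exactly this form is an assumption, and the two-dimensional vectors are toy stand-ins for real image and text embeddings.

```python
import math
from statistics import mean, stdev

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def association(w, attrs_a, attrs_b):
    """Mean similarity of w to attribute set A minus that to attribute set B."""
    return (mean([cosine(w, a) for a in attrs_a])
            - mean([cosine(w, b) for b in attrs_b]))

def eat_effect_size(targets_x, targets_y, attrs_a, attrs_b):
    """Cohen's-d-style effect size over two target sets (sample std. dev.)."""
    s_x = [association(x, attrs_a, attrs_b) for x in targets_x]
    s_y = [association(y, attrs_a, attrs_b) for y in targets_y]
    return (mean(s_x) - mean(s_y)) / stdev(s_x + s_y)

# Toy embeddings (assumed): X leans toward attribute A, Y toward attribute B.
X = [[1.0, 0.1], [0.9, 0.2]]
Y = [[0.1, 1.0], [0.2, 0.9]]
A = [[1.0, 0.0]]
B = [[0.0, 1.0]]
d = eat_effect_size(X, Y, A, B)
```

A positive `d` indicates that the first target set is more associated with attribute set A than the second target set is; swapping the target sets flips the sign.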
Electronic health records (EHR) offer unprecedented opportunities for in-depth clinical phenotyping and prediction of clinical outcomes. Combining multiple data sources is crucial to generate a complete picture of disease prevalence, incidence and trajectories. The standard approach to combining clinical data involves collating clinical terms across different terminology systems using curated maps, which are often inaccurate and/or incomplete. Here, we propose sEHR-CE, a novel framework based on transformers to enable integrated phenotyping and analyses of heterogeneous clinical datasets without relying on these mappings. We unify clinical terminologies using textual descriptors of concepts, and represent individuals' EHR as sections of text. We then fine-tune pre-trained language models to predict disease phenotypes more accurately than non-text and single terminology approaches. We validate our approach using primary and secondary care data from the UK Biobank, a large-scale research study. Finally, we illustrate in a type 2 diabetes use case how sEHR-CE identifies individuals without diagnosis that share clinical characteristics with patients.
Machine learning models are now able to convert user-written text descriptions into naturalistic images. These models are available to anyone online and are being used to generate millions of images a day. We investigate these models and find that they amplify dangerous and complex stereotypes. Moreover, we find that the amplified stereotypes are difficult to predict and not easily mitigated by users or model owners. The extent to which these image-generation models perpetuate and amplify stereotypes and their mass deployment is cause for serious concern.
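Three state-of-the-art language-and-image AI models, CLIP, SLIP, and BLIP, are evaluated for evidence of a bias previously observed in social and experimental psychology: equating American identity with being White. Embedding association tests (EATs) using standardized images of self-identified Asian, Black, Latina/o, and White individuals from the Chicago Face Database (CFD) reveal that White individuals are more associated with collective in-group words than are Asian, Black, or Latina/o individuals. In assessments of three core aspects of American identity reported by social psychologists, single-category EATs reveal that images of White individuals are more associated with patriotism and with being born in America, but that, consistent with prior findings in psychology, White individuals are associated with being less likely to treat people of all races and backgrounds equally. Three downstream machine learning tasks demonstrate biases associating American with White. In a visual question answering task using BLIP, 97% of White individuals are identified as American, compared to only 3% of Asian individuals. When asked in what state the depicted individual lives, the model responds "China" 53% of the time for Asian individuals, but always with an American state for White individuals. In an image captioning task, BLIP remarks upon the race of Asian individuals as much as 36% of the time, but never remarks upon race for White individuals. Finally, provided with an initialization image from the CFD and the text "an American person," a synthetic image generator (VQGAN) using the text-based guidance of CLIP lightens the skin tone of individuals of all races (by 35% for Black individuals, based on pixel brightness). The results indicate that biases equating American identity with being White are learned by language-and-image AI and propagate to downstream applications of such models.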
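The websites of online travel agencies (OTAs) are advertised on metasearch bidding engines. Predicting the number of clicks a hotel will receive for a given bid amount is an important step in managing an OTA's advertising campaign on a metasearch engine, because the bid multiplied by the number of clicks defines the cost to be generated. In this work, various regressors are ensembled to improve click prediction performance. Following the preprocessing procedures, the feature set is divided into train and test groups depending on the logging date of the samples. The data collection is then subjected to XGBoost-based dimension reduction, which significantly reduces the dimensionality of the features. The optimal hyperparameters are then found by applying Bayesian hyperparameter optimization to the XGBoost, LightGBM, and SGD models. Ten different machine learning models are tested individually, and they are combined to create ensemble models. Three alternative ensemble solutions are proposed. The same test set is used to test both the individual and the ensemble models, and the results of 46 model combinations show that the stacking ensemble models yield the best R2 score of all. In conclusion, the ensemble model improves the prediction performance by about 10%.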
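Two steps of this pipeline, splitting samples by logging date and scoring predictions with R2, can be sketched with the standard library alone. The example combines two base models by simple averaging, which is a deliberately simpler stand-in for the stacking ensemble the abstract reports. The field name `logged`, the cutoff date, and all numbers are assumptions for illustration only.

```python
from datetime import date

def split_by_date(rows, cutoff):
    """Put samples logged before the cutoff in train, the rest in test."""
    train = [r for r in rows if r["logged"] < cutoff]
    test = [r for r in rows if r["logged"] >= cutoff]
    return train, test

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def average_ensemble(predictions):
    """Average the per-sample predictions of several models."""
    return [sum(col) / len(col) for col in zip(*predictions)]

# Toy click logs (assumed): one row per hotel-day with the observed clicks.
rows = [
    {"logged": date(2022, 1, 1), "clicks": 5},
    {"logged": date(2022, 1, 2), "clicks": 7},
    {"logged": date(2022, 2, 1), "clicks": 10},
    {"logged": date(2022, 2, 2), "clicks": 20},
    {"logged": date(2022, 2, 3), "clicks": 30},
    {"logged": date(2022, 2, 4), "clicks": 40},
]
train, test = split_by_date(rows, cutoff=date(2022, 2, 1))
y_test = [r["clicks"] for r in test]

# Hypothetical test-set predictions of two base models with opposite biases.
pred_a = [12, 22, 32, 42]
pred_b = [8, 18, 28, 38]
pred_ens = average_ensemble([pred_a, pred_b])

r2_a = r2_score(y_test, pred_a)
r2_ens = r2_score(y_test, pred_ens)
```

In this contrived case the two models err in opposite directions, so their average scores higher than either one alone. A stacking ensemble would instead fit a meta-model on the base models' predictions rather than weighting them equally.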